Quickstart¶
This notebook will cover all the basic and most useful functionality available to get a user up and running as fast as possible.
Installation¶
Installation can be done via a pip install:
pip install remotemanager
for the most recent stable version.
However if you would like the bleeding edge version, you can clone the devel
branch of the git repository:
git clone --branch devel <repository-url> remotemanager && pip install ./remotemanager
Function Definition¶
remotemanager
executes user-defined Python functions at a location of your choice. Below is a basic function that will serve our purposes for this guide.
Important
The function must stand by itself when running, so any imports or necessary functionality should be contained within.
[1]:
def multiply(a, b):
    import time
    time.sleep(1)
    return a * b
Running Remotely¶
This function would run just fine on any workstation, but to run something more complex we would need to connect to more powerful resources.
remotemanager
provides the powerful Computer
module for this purpose:
[2]:
from remotemanager import Computer
First, we must define a “template”. This is the base from which a submission script will be generated.
The easiest way to create one of these templates is to acquire a jobscript that you know works for your machine. A few suggestions for this:
Machine documentation may have an example script to build from (or even a configurator!)
If you have already run jobs, your own scripts should suffice; otherwise a colleague may have an example for you
The helpdesk may be able to assist you in creating a jobscript for your use case
In this example, we will be taking an existing jobscript that we know works.
We will also parameterise just a single option, #username#
. This syntax allows Computer to provide a “dynamic” input that can be changed.
The basic syntax for parameterisation is that anything between double #hashes#
will be treated as a parameter and added to the computer. Here, for example, a variable called “hashes” would be created.
Important
Parameters will be sanitised to all lowercase. Therefore #ARG#
== #arg#
.
Important
Parameters must not clash with internal names, an error will be raised in this case. For example, we have to choose #username#
here instead of #user#
, since user
is already an internal argument.
Note
This is covered in greater detail in the dedicated tutorial
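To illustrate the idea behind the #hashes# syntax, the substitution can be sketched in a few lines of standard-library Python. This is not remotemanager's actual parser, just the concept: parameter names between double hashes are collected, sanitised to lowercase, and later replaced by their assigned values.

```python
import re

template = "#SBATCH --account=#USERNAME#"

# collect parameter names found between double hashes, lowercased
params = {m.lower() for m in re.findall(r"#(\w+)#", template)}

# substitute assigned values back into the template
values = {"username": "myuser"}
script = re.sub(r"#(\w+)#", lambda m: values[m.group(1).lower()], template)
```

Note that `#SBATCH` on its own is not treated as a parameter, since it is not closed by a second hash.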
[3]:
template = """#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --cpus-per-task=4
#SBATCH --time=00:30:00
#SBATCH --job-name=quickstart
#SBATCH --account=#username#
#SBATCH --partition=boost_usr_prod
#SBATCH --qos=normal
export OMP_NUM_THREADS=4
module load python/3.10.8--gcc--11.3.0
"""
Now, create a Computer. At a minimum you should specify:
Host address (or userhost=user@host)
The submitter that your job system uses (defaults to bash, which will run on the login node)
[4]:
connection = Computer(
    user="user",
    host="remote.hpc.url",
    submitter="sbatch",
    template=template,
)
# note that template arguments must be specified after initialisation
connection.username = "myuser"
This example connection is pointed at an imaginary user@remote.hpc.url
. However, this uses your ssh
configuration, so you are able to connect to a machine in the same way that you would from a command line.
For example, if there existed a machine which you connected to with ssh machine
, then you are able to create a computer using:
connection = Computer("machine")
Important
Computer
requires that you are able to ssh into the remote machine without any additional prompts from the remote. For connection difficulties regarding permissions, see the relevant section of the introduction.
Tip
The connection parameters inherit those from your ssh config. So if you are able to ssh <host>
, you can create a Computer
with Computer("<host>")
.
Tip
Before using Computer
for the first time on a machine, any immediate problems can be discovered by testing a basic command. Start with a simple ssh user@remote "ls"
and see what comes back. If the terminal returns a sensible output without prompting for a password, a Computer
should function as expected.
Now we have a connection ready to go, we can see an example of the script that would be produced:
[6]:
print(connection.script())
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --cpus-per-task=4
#SBATCH --time=00:30:00
#SBATCH --job-name=quickstart
#SBATCH --account=myuser
#SBATCH --partition=boost_usr_prod
#SBATCH --qos=normal
export OMP_NUM_THREADS=4
# module load python/3.10.8--gcc--11.3.0
Remote Commands¶
With this remote connection
, we can execute commands and (more importantly) our function on this machine.
For commands, the connection provides a cmd
method, which will execute any string it is given
[7]:
connection.cmd('echo "this command is executed on the remote"')
[7]:
this command is executed on the remote
Running Functions¶
For function execution, we require a Dataset
.
Note
Think of a Dataset
as a container for a function.
Like URL
, this can be imported directly from remotemanager
:
[8]:
from remotemanager import Dataset
To create a dataset, pass your function to the Dataset
constructor.
Note
When passing a function to the dataset, do not call it within the assignment. For example, call Dataset(function=multiply)
not Dataset(function=multiply())
Here we are additionally specifying the local_dir
and the remote_dir
, which tells the Dataset where to put all relevant files on the local and remote machines, respectively.
Note
We will use skip=False
in the Dataset creation, otherwise the Dataset
will see the dataset we previously created and import its data rather than create itself anew.
[9]:
ds = Dataset(function=multiply,
url=connection,
local_dir='temp_local', # Location where files will be "staged", before sending to the remote
remote_dir='temp_remote', # Location on the remote server where the run will be executed
skip=False
)
Important
This dataset has no runs, as it is just a container for the function multiply
. For this, we must add runners.
Creating runs¶
To add runs, we use the Dataset.append_run()
method. This will take the arguments in dict
format, and store them for later.
You may do this in any way you see fit; the important part is to pass a dictionary containing all necessary arguments for running your function:
[10]:
runs = [[21, 2],
        [64, 8],
        [10, 7]]

for run in runs:
    a = run[0]
    b = run[1]
    arguments = {'a': a, 'b': b}
    ds.append_run(arguments=arguments)
appended run runner-0
appended run runner-1
appended run runner-2
Running and Retrieving your results¶
Now that we have created a dataset and appended some runs, we can launch the calculations. This is done via the Dataset.run()
method.
Once the runs have completed, you can retrieve your results with ds.fetch_results()
, and access them via ds.results
once this is done.
Important
fetch_results()
does not return your results, but collects the files and stores them within the runners.
[11]:
ds.run()
Staging Dataset... Staged 3/3 Runners
Transferring for 3/3 Runners
Transferring 9 Files... Done
Remotely executing 3/3 Runners
[11]:
True
Wait¶
Calculations can take time, so we can add an optional wait
call here to await the dataset completion.
The first number is the check interval
, the second is the maximum wait time (set to None
for an indefinite wait).
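Conceptually, wait(interval, max_time) behaves like the polling loop below. This is a minimal sketch, not the library's implementation; check stands in for the dataset's internal completion test.

```python
import time

def wait(check, interval, max_time):
    """Poll check() every `interval` seconds until it returns True,
    or until `max_time` seconds have elapsed (None waits indefinitely)."""
    start = time.time()
    while not check():
        if max_time is not None and time.time() - start > max_time:
            raise TimeoutError("dataset did not complete in time")
        time.sleep(interval)

# toy completion check that succeeds on its third poll
calls = {"n": 0}
def done():
    calls["n"] += 1
    return calls["n"] >= 3

wait(done, 0.01, 1)
```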
[12]:
ds.wait(1, 10)
Now that the runs have completed, we must fetch the results before they are made available:
[13]:
ds.fetch_results()
Fetching results
Transferring 6 Files... Done
Results have been fetched from the remote, now we can access them.
[14]:
print(ds.results)
[42, 512, 70]
[15]:
ds.errors
[15]:
[None, None, None]
With this, you have all of the basic tools needed to run Python functions on a remote machine. See the other tutorials for more advanced usage.
Warning
Be aware that on macOS, you may receive errors when transferring data. This is most likely due to macOS natively shipping an old rsync
version (<3.0.0). More information is available on this page.